Use static empty store files metadata #84034

Merged

Conversation

DaveCTurner

In a large cluster we expect most nodes not to have a copy of most
shards, but today during replica shard allocation we create a new (and
nontrivial) object for each node that has no copy of a shard. With this
commit we check at deserialization time whether the response is empty
and, if so, avoid the unnecessary instantiation.

Relates #77466
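
A minimal sketch of the idea in the description, under a simplified wire format. `Snapshot`, `EMPTY`, and `roundTrip` are hypothetical names, and `java.io.DataInputStream` stands in for Elasticsearch's `StreamInput`; this is an illustration, not the code from this PR.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

public class EmptySnapshotSketch {

    /** Hypothetical stand-in for a per-node "store files" response. */
    static final class Snapshot {
        /** One shared instance for the common "this node has no copy" case. */
        static final Snapshot EMPTY = new Snapshot(Map.of());

        final Map<String, Long> fileLengths;

        Snapshot(Map<String, Long> fileLengths) {
            this.fileLengths = fileLengths;
        }

        static Snapshot readFrom(DataInputStream in) throws IOException {
            final int size = in.readInt();
            if (size == 0) {
                // Empty response: reuse the static instance instead of allocating.
                return EMPTY;
            }
            final Map<String, Long> files = new HashMap<>();
            for (int i = 0; i < size; i++) {
                files.put(in.readUTF(), in.readLong());
            }
            return new Snapshot(files);
        }

        void writeTo(DataOutputStream out) throws IOException {
            out.writeInt(fileLengths.size());
            for (Map.Entry<String, Long> entry : fileLengths.entrySet()) {
                out.writeUTF(entry.getKey());
                out.writeLong(entry.getValue());
            }
        }
    }

    /** Serializes and deserializes a snapshot in memory. */
    static Snapshot roundTrip(Snapshot snapshot) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                snapshot.writeTo(out);
            }
            try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return Snapshot.readFrom(in);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Snapshot a = roundTrip(new Snapshot(Map.of()));
        Snapshot b = roundTrip(new Snapshot(Map.of()));
        System.out.println(a == b); // prints true: both are the shared EMPTY instance
    }
}
```

With many nodes and many shards, most responses take the `size == 0` branch, so the per-response allocation is avoided in the common case.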

@DaveCTurner DaveCTurner added >enhancement :Distributed Coordination/Allocation All issues relating to the decision making around placing a shard (both master logic & on the nodes) v8.2.0 labels Feb 16, 2022
@elasticmachine elasticmachine added the Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. label Feb 16, 2022
@elasticmachine

Pinging @elastic/es-distributed (Team:Distributed)

@elasticsearchmachine

Hi @DaveCTurner, I've created a changelog YAML for you.

commitUserData.put(in.readString(), in.readString());
public static MetadataSnapshot readFrom(StreamInput in) throws IOException {
final int metadataSize = in.readVInt();
final Map<String, StoreFileMetadata> metadata = metadataSize == 0 ? emptyMap() : new HashMap<>();
Contributor

Nit: we could use `org.elasticsearch.common.util.Maps#newMapWithExpectedSize` to avoid resizing the map.
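
To illustrate why pre-sizing helps, here is a rough sketch using only the JDK. It is an approximation of the idea, not the implementation of `Maps#newMapWithExpectedSize`; `PresizedMapSketch` and `capacityFor` are hypothetical names. A `HashMap` with the default load factor of 0.75 resizes once its size exceeds 0.75 times its capacity, so the initial capacity must be larger than the expected number of entries.

```java
import java.util.HashMap;
import java.util.Map;

public class PresizedMapSketch {

    /** Initial capacity that holds expectedSize entries at load factor 0.75 without resizing. */
    static int capacityFor(int expectedSize) {
        if (expectedSize < 0) {
            throw new IllegalArgumentException("expectedSize must be non-negative: " + expectedSize);
        }
        return (int) Math.ceil(expectedSize / 0.75);
    }

    /** Hypothetical helper: a HashMap sized up front for the expected number of entries. */
    static <K, V> Map<K, V> newMapWithExpectedSize(int expectedSize) {
        return new HashMap<>(capacityFor(expectedSize));
    }

    public static void main(String[] args) {
        System.out.println(capacityFor(12)); // prints 16
        Map<String, Long> files = newMapWithExpectedSize(12);
        files.put("segments_1", 42L);
        System.out.println(files.size()); // prints 1
    }
}
```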

@idegtiarenko idegtiarenko Feb 16, 2022

It may also be worth generalizing this reading pattern with something like
`Map<K, V> readMapFromList(Writeable.Reader<V> valueReader, Function<V, K> keyCreator) throws IOException`, which would internally handle map creation, initialization, and sizing.

I think this is not the only place where we create a map from a list.
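
A hedged sketch of what such a helper could look like, written against plain JDK streams. The `Reader` interface here is a hypothetical stand-in for `Writeable.Reader`, the stream is passed explicitly rather than being the receiver, and `FileInfo` and `roundTrip` exist only for the demonstration; none of this is taken from the Elasticsearch codebase.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class ReadMapFromListSketch {

    /** Hypothetical stand-in for Writeable.Reader. */
    @FunctionalInterface
    interface Reader<V> {
        V read(DataInputStream in) throws IOException;
    }

    /** Reads a length-prefixed list of values and indexes them by a derived key. */
    static <K, V> Map<K, V> readMapFromList(DataInputStream in, Reader<V> valueReader, Function<V, K> keyCreator)
        throws IOException {
        final int size = in.readInt();
        if (size == 0) {
            return Map.of(); // shared immutable empty map, no allocation
        }
        // Size the map up front so it does not resize while being filled.
        final Map<K, V> map = new HashMap<>((int) Math.ceil(size / 0.75));
        for (int i = 0; i < size; i++) {
            final V value = valueReader.read(in);
            map.put(keyCreator.apply(value), value);
        }
        return map;
    }

    /** Demonstration value type: a file name and its length. */
    static final class FileInfo {
        final String name;
        final long length;

        FileInfo(String name, long length) {
            this.name = name;
            this.length = length;
        }

        static FileInfo readFrom(DataInputStream in) throws IOException {
            return new FileInfo(in.readUTF(), in.readLong());
        }
    }

    /** Writes the list in memory, then reads it back as a map keyed by file name. */
    static Map<String, FileInfo> roundTrip(List<FileInfo> files) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                out.writeInt(files.size());
                for (FileInfo file : files) {
                    out.writeUTF(file.name);
                    out.writeLong(file.length);
                }
            }
            try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return readMapFromList(in, FileInfo::readFrom, file -> file.name);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, FileInfo> byName = roundTrip(List.of(new FileInfo("segments_1", 42L), new FileInfo("_0.cfs", 7L)));
        System.out.println(byName.get("_0.cfs").length); // prints 7
    }
}
```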

Contributor Author

++ see 488a05b

It looks like we are not properly sizing the maps produced by StreamInput#readMap and StreamInput#readOrderedMap either. I'm not touching that in this PR, nor the map-from-list idea, but it seems reasonable, I think.

@idegtiarenko idegtiarenko Feb 16, 2022

#84045 is a PR for pre-sizing maps in readMap. We could discuss whether there are any concerns around that topic.

Contributor

I could also open a follow-up PR on map-from-list.

@DaveCTurner DaveCTurner merged commit 1b7b2a1 into elastic:master Feb 17, 2022
@DaveCTurner DaveCTurner deleted the 2022-02-16-reuse-empty-file-metadata branch February 17, 2022 10:24
weizijun added a commit to weizijun/elasticsearch that referenced this pull request Feb 18, 2022
* upstream/master: (167 commits)
  Mute FrozenSearchableSnapshotsIntegTests#testCreateAndRestorePartialSearchableSnapshot
  Mute LdapSessionFactoryTests#testSslTrustIsReloaded
  Fix spotless violation from last commit
  Mute GeoGridTilerTestCase#testGeoGridSetValuesBoundingBoxes_UnboundedGeoShapeCellValues
  Small formatting clean up (elastic#84144)
  Always re-run Feature migrations which have encountered errors (elastic#83918)
  [DOCS] Clarify `orientation` usage for WKT and GeoJSON polygons (elastic#84025)
  Group field caps response by index mapping hash (elastic#83494)
  Shrink join queries in slow log (elastic#83914)
  TSDB: Reject the nested object fields that are configured time_series_dimension (elastic#83920)
  [DOCS] Remove note about partial response from Bulk API docs (elastic#84053)
  Allow regular data streams to be migrated to tsdb data streams. (elastic#83843)
  [DOCS] Fix `ignore_unavailable` parameter definition (elastic#84071)
  Make Metadata extend AbstractCollection (elastic#83791)
  Add API specs for OpenID Connect APIs
  Revert "Clean up for superuser role name references (elastic#83627)" (elastic#84096)
  Update Lucene analysis base url (elastic#84094)
  Avoid null threadContext in ResultDeduplicator (elastic#84093)
  Use static empty store files metadata (elastic#84034)
  Preserve context in snapshotDeletionListeners (elastic#84089)
  ...

# Conflicts:
#	x-pack/plugin/rollup/build.gradle
probakowski pushed a commit to probakowski/elasticsearch that referenced this pull request Feb 23, 2022
DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this pull request Feb 24, 2022
`TransportNodesListShardStoreMetadata$StoreFilesMetadata` and
`Store$MetadataSnapshot` are both morally-speaking records, and
`LoadedMetadata` is really the same as `MetadataSnapshot`. This commit
turns them into real records, gets rid of the unnecessary extra class,
and renames some of the accessors.

Spotted while working on elastic#84034
DaveCTurner added a commit that referenced this pull request Feb 28, 2022
Labels
:Distributed Coordination/Allocation All issues relating to the decision making around placing a shard (both master logic & on the nodes) >enhancement Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. v8.2.0
4 participants